Add bitnet-embeddings-0.6b model adaptation and GGUF conversion tools #558

Open
isHuangXin wants to merge 2 commits into microsoft:main from isHuangXin:dev-bitnet-embedding-0.6b

@isHuangXin

  • Add GGUF conversion tool for bitnet-embeddings-0.6b (safetensors -> F16 GGUF and I2_S GGUF)
  • Add Qwen3 architecture support in llama.cpp submodule with per-projection RMSNorm
  • Add I2_S ternary quantization (2-bit packed -1/0/+1), which stores the ternary weight values without loss
  • Add f16 norm weight support for correct embedding inference
  • Add benchmark and accuracy verification scripts
  • Add GGUF layer inspection utilities for F16 and I2_S formats
  • Add bitnet-lut-kernels.h placeholder for standalone compilation
  • Update llama.cpp submodule to dev-bitnet-embedding-0.6b branch

@isHuangXin
Author

@microsoft-github-policy-service agree

…nversion

- Add GGUF conversion tool for bitnet-embeddings-0.6b (safetensors -> F16/I2_S GGUF)
- Add Qwen3 architecture support in llama.cpp submodule with per-projection RMSNorm
- Add I2_S ternary quantization (2-bit packed -1/0/+1) for lossless precision
- Add f16 norm weight support for correct embedding inference
- Add GGUF conversion documentation
- Update llama.cpp submodule to dev-bitnet-embedding-0.6b branch
@isHuangXin force-pushed the dev-bitnet-embedding-0.6b branch from f695806 to 28ef5f6 on May 20, 2026 17:06
…endency

- Add AVX512BW SIMD paths to ggml_vec_dot_i2_i8_s_1x1, _1xN, _Nx1
  in ggml-bitnet-mad.cpp, processing 2 I2_S blocks per iteration
  with 512-bit registers for ~2x throughput on AVX512-capable CPUs
- Guard #include "bitnet-lut-kernels.h" in ggml-bitnet-lut.cpp with
  TL1/TL2 preprocessor checks so I2_S builds no longer require
  the auto-generated header
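
For readers unfamiliar with what a packed-ternary by int8 dot product computes, a scalar reference can be sketched as follows. This is a hedged illustration, not the code in ggml-bitnet-mad.cpp: the function name `dot_packed_ternary_i8` and the encoding (2-bit code = weight + 1, four codes per byte, low bits first) are assumptions, and the sketch ignores the real I2_S block structure, scales, and the AVX512 path.

```python
# Scalar reference sketch. Assumed encoding (hypothetical, not I2_S):
# each 2-bit code equals weight + 1, four codes per byte, lowest bits first.

def dot_packed_ternary_i8(packed, activations):
    """Dot product of 2-bit-packed ternary weights with int8 activations."""
    total = 0
    for idx, a in enumerate(activations):
        if not -128 <= a <= 127:
            raise ValueError("activations must fit in int8")
        byte = packed[idx // 4]
        code = (byte >> (2 * (idx % 4))) & 0b11
        total += (code - 1) * a  # code - 1 is the ternary weight
    return total


# Bytes 164 and 81 encode the weights [-1, 0, 1, 1, 0, -1] under the
# assumed encoding (the last two codes in the second byte are padding).
packed = bytes([164, 81])
activations = [3, -2, 5, 7, 11, -4]
result = dot_packed_ternary_i8(packed, activations)  # -3 + 0 + 5 + 7 + 0 + 4 = 13
```

A SIMD version would process many codes per instruction instead of one per loop iteration; a scalar reference like this is mainly useful as a correctness check against the optimized paths.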